Note: if you use MAGeCKFlute in published research, please cite: Binbin Wang, Mei Wang, Wubing Zhang. “Integrative analysis of pooled CRISPR genetic screens using MAGeCKFlute.” Nature Protocols (2019), doi: 10.1038/s41596-018-0113-7.

Input data - results from MAGeCK

MAGeCK (T. X. Wei Li Han Xu and Liu. 2014) and MAGeCK-VISPR (H. X. Wei Li Johannes Köster and Liu. 2015) are developed by our lab previously, to analyze CRISPR/Cas9 screen data in different scenarios(Tim Wang 2014, Hiroko Koike-Yusa (2014), Ophir Shalem1 (2014), Luke A.Gilbert (2014), Silvana Konermann (2015)). Both algorithms use negative binomial models to model the variances of sgRNAs, and use Robust Rank Aggregation (for MAGeCK) or maximum likelihood framework (for MAGeCK-VISPR) for a robust identification of selected genes.

The command mageck mle computes beta scores and the associated statistics for all genes in multiple conditions. The beta score describes how the gene is selected: a positive beta score indicates a positive selection, and a negative beta score indicates a negative selection.

The command mageck test uses Robust Rank Aggregation (RRA) for robust identification of CRISPR-screen hits, and outputs the summary results at both sgRNA and gene level.

Quick start

Here we show the most basic steps for integrative analysis pipeline. MAGeCKFlute package includes the MAGeCK results from one intrinsic dataset, including countsummary, rra.gene_summary, rra.sgrna_summary, and mle.gene_summary. We will work with them in this document.

library(MAGeCKFlute)
library(ggplot2)

Downstream analysis pipeline for MAGeCK RRA

##Load gene summary data in MAGeCK RRA results
data("rra.gene_summary")
data("rra.sgrna_summary")
##Run the downstream analysis pipeline for MAGeCK RRA
FluteRRA(rra.gene_summary, rra.sgrna_summary, proj="PLX", organism="hsa")

All pipeline results are written into local directory “./MAGeCKFlute_PLX/”, and all figures are integrated into file “PLX_Flute.rra_summary.pdf”.

Downstream analysis pipeline for MAGeCK MLE CRISPR screens can be conducted using three different cell populations: the day 0 population, a drug-treated population (treatment) and a control population (mockdrug control, typically treated with a vehicle such as DMSO). The CRISPR screens are required to analyze with MAGeCK MLE, which generates a gene summary file like mle.gene_summary below. FluteMLE is useful in this scenario to perform downstream analysis.

data("mle.gene_summary")
## Run the downstream analysis pipeline for MAGeCK MLE
FluteMLE(mle.gene_summary, treatname="plx", ctrlname="dmso", proj="PLX", organism="hsa")

If CRISPR screens don’t include a control cell population, FluteMLE supports users to use Depmap screens as control.

## Take Depmap screen as control
FluteMLE(mle.gene_summary, treatname="plx", ctrlname="Depmap", proj="PLX", organism="hsa", incorporateDepmap = TRUE)
## If a specific cell line is preferred, you can look at the similarity of your screen with Depmap screens by using function ResembleDepmap.
depmap_similarity = ResembleDepmap(mle.gene_summary, symbol = "Gene", score = "plx.beta")
FluteMLE(mle.gene_summary, treatname="plx", ctrlname="Depmap", proj="PLX", organism="hsa", incorporateDepmap = TRUE, cell_lines = rownames(depmap_similarity)[1], lineages = "All")

All pipeline results are written into local directory “./MAGeCKFlute_PLX/”, and all figures are integrated into file “PLX_Flute.mle_summary.pdf”.

Section I: Quality control

** Count summary ** MAGeCK Count in MAGeCK/MAGeCK-VISPR generates a count summary file, which summarizes some basic QC scores at raw count level, including map ratio, Gini index, and NegSelQC. Use function ‘data’ to load the dataset, and have a look at the file with a text editor to see how it is formatted.

data("countsummary")
head(countsummary)
##                                   File    Label    Reads   Mapped
## 1 ../data/GSC_0131_Day23_Rep1.fastq.gz day23_r1 62818064 39992777
## 2  ../data/GSC_0131_Day0_Rep2.fastq.gz  day0_r2 47289074 31709075
## 3  ../data/GSC_0131_Day0_Rep1.fastq.gz  day0_r1 51190401 34729858
## 4 ../data/GSC_0131_Day23_Rep2.fastq.gz day23_r2 58686580 37836392
##   Percentage TotalsgRNAs Zerocounts GiniIndex NegSelQC NegSelQCPval
## 1     0.6366       64076         57   0.08510        0            1
## 2     0.6705       64076         17   0.07496        0            1
## 3     0.6784       64076         14   0.07335        0            1
## 4     0.6447       64076         51   0.08587        0            1
##   NegSelQCPvalPermutation NegSelQCPvalPermutationFDR NegSelQCGene
## 1                       1                          1            0
## 2                       1                          1            0
## 3                       1                          1            0
## 4                       1                          1            0
# Gini index
BarView(countsummary, x = "Label", y = "GiniIndex",
        ylab = "Gini index", main = "Evenness of sgRNA reads")

# Missed sgRNAs
countsummary$Missed = log10(countsummary$Zerocounts)
BarView(countsummary, x = "Label", y = "Missed", fill = "#394E80",
        ylab = "Log10 missed gRNAs", main = "Missed sgRNAs")

# Read mapping
MapRatesView(countsummary)

# Or
countsummary$Unmapped = countsummary$Reads - countsummary$Mapped
gg = data.table::melt(countsummary[, c("Label", "Mapped", "Unmapped")], id.vars = "Label")
gg$variable = factor(gg$variable, levels = c("Unmapped", "Mapped"))
p = BarView(gg, x = "Label", y = "value", fill = "variable", 
            position = "stack", xlab = NULL, ylab = "Reads", main = "Map ratio")
p + scale_fill_manual(values = c("#9BC7E9", "#1C6DAB"))

Section II: Downstream analysis of MAGeCK RRA

For experiments with two experimental conditions, we recommend MAGeCK-RRA for identification of essential genes from CRISPR/Cas9 knockout screens. Gene summary file in MAGeCK-RRA results summarizes the statistical significance of positive selections and negative selections.

library(MAGeCKFlute)
data("rra.gene_summary")
head(rra.gene_summary)
##       id num  neg.score neg.p.value  neg.fdr neg.rank neg.goodsgrna
## 1    NF2   4 4.1770e-12  2.9738e-07 0.000275        1             4
## 2 SRSF10   4 4.4530e-11  2.9738e-07 0.000275        2             4
## 3 EIF2B4   8 2.8994e-10  2.9738e-07 0.000275        3             8
## 4  LAS1L   6 1.4561e-09  2.9738e-07 0.000275        4             6
## 5   RPL3  15 2.3072e-09  2.9738e-07 0.000275        5            12
## 6 ATP6V0   7 3.8195e-09  2.9738e-07 0.000275        6             7
##   neg.lfc pos.score pos.p.value pos.fdr pos.rank pos.goodsgrna pos.lfc
## 1 -1.3580   1.00000     1.00000       1    16645             0 -1.3580
## 2 -1.8544   1.00000     1.00000       1    16647             0 -1.8544
## 3 -1.5325   1.00000     1.00000       1    16646             0 -1.5325
## 4 -2.2402   0.99999     0.99999       1    16570             0 -2.2402
## 5 -1.0663   0.95519     0.99205       1    15359             2 -1.0663
## 6 -1.6380   1.00000     1.00000       1    16644             0 -1.6380
data(rra.sgrna_summary)
head(rra.sgrna_summary)
##     sgrna  Gene control_count treatment_count control_mean treat_mean
## 1 s_10963 CDKN2 1175.4/1156.7     4110.7/4046      1166.00    4078.30
## 2 s_10959 CDKN2 651.49/647.25   2188.3/3020.6       649.37    2604.40
## 3 s_36798   NF2    8917/21204   5020.7/5127.9     15061.00    5074.30
## 4 s_45763 RAB6A 3375.8/3667.7   372.88/357.79      3521.80     365.33
## 5 s_23611  GPN1 4043.8/4064.2    767.53/853.7      4054.00     810.61
## 6 s_50164   SF1 3657.8/3352.6   453.62/628.28      3505.20     540.95
##       LFC control_var adj_var  score       p.low p.high  p.twosided
## 1  1.8055  1.7417e+02  4531.0 43.266  1.0000e+00      0  0.0000e+00
## 2  2.0022  8.9814e+00  2365.7 40.195  1.0000e+00      0  0.0000e+00
## 3 -1.5693  7.5491e+07 78871.0 35.559 2.9804e-277      1 5.9609e-277
## 4 -3.2655  4.2617e+04 15519.0 25.338 6.1638e-142      1 1.2328e-141
## 5 -2.3208  2.0966e+02 18159.0 24.069 2.6711e-128      1 5.3423e-128
## 6 -2.6937  4.6575e+04 15438.0 23.857 4.2365e-126      1 8.4731e-126
##           FDR high_in_treatment
## 1  0.0000e+00              True
## 2  0.0000e+00              True
## 3 1.2732e-272             False
## 4 1.9748e-137             False
## 5 6.8462e-124             False
## 6 9.0487e-122             False

Visualization of negative selections and positive selections

Read the MAGeCK-RRA results

Two type of scores are optional, lfc (logrithm fold change) and rra (Robust Rank Aggregation score).

dd.rra = ReadRRA(rra.gene_summary)
head(dd.rra)
##       id   Score      FDR
## 1    NF2 -1.3580 0.000275
## 2 SRSF10 -1.8544 0.000275
## 3 EIF2B4 -1.5325 0.000275
## 4  LAS1L -2.2402 0.000275
## 5   RPL3 -1.0663 0.000275
## 6 ATP6V0 -1.6380 0.000275
dd.sgrna = ReadsgRRA(rra.sgrna_summary)
head(dd.sgrna)
##     sgrna  Gene     LFC         FDR
## 1 s_10963 CDKN2  1.8055  0.0000e+00
## 2 s_10959 CDKN2  2.0022  0.0000e+00
## 3 s_36798   NF2 -1.5693 1.2732e-272
## 4 s_45763 RAB6A -3.2655 1.9748e-137
## 5 s_23611  GPN1 -2.3208 6.8462e-124
## 6 s_50164   SF1 -2.6937 9.0487e-122

Compute the similarity between the CRISPR screen with Depmap screens

depmap_similarity = ResembleDepmap(dd.rra, symbol = "id", score = "Score", lineages = "All", method = "pearson")
head(depmap_similarity)
##          estimate p.value
## A2780   0.5791483       0
## OVCAR8  0.5789564       0
## R262    0.5755967       0
## HS578T  0.5754257       0
## SF295   0.5751215       0
## KYSE180 0.5749847       0

Omit common essential genes from the data

dd.rra = OmitCommonEssential(dd.rra)
dd.sgrna = OmitCommonEssential(dd.sgrna, symbol = "Gene")
# Compute the similarity with Depmap screens based on subset genes
depmap_similarity = ResembleDepmap(dd.rra, symbol = "id", score = "Score", lineages = "All", method = "pearson")
head(depmap_similarity)
##          estimate       p.value
## TE11    0.2394741 5.392597e-149
## OVCAR8  0.2390556 6.636024e-154
## YH13    0.2386189 2.470833e-153
## HUPT3   0.2380013 3.869874e-147
## NCIH661 0.2379885 1.640237e-152
## LXF289  0.2360406 5.494129e-150

Volcano plot

dd.rra$LogFDR = -log10(dd.rra$FDR)
p1 = ScatterView(dd.rra, x = "Score", y = "LogFDR", label = "id", model = "volcano", top = 5)
print(p1)

# Or
p2 = VolcanoView(dd.rra, x = "Score", y = "FDR", Label = "id")
print(p2)

Rank plot

Rank all the genes based on their scores and label genes in the rank plot.

dd.rra$Rank = rank(dd.rra$Score)
p1 = ScatterView(dd.rra, x = "Rank", y = "Score", label = "id", 
                 top = 5, auto_cut_y = TRUE, groups = c("top", "bottom"))
print(p1)

# Label interested hits using parameter `toplabels` (in ScatterView) and `genelist` (in RankView).
ScatterView(dd.rra, x = "Rank", y = "Score", label = "id", auto_cut_y = TRUE, 
            groups = c("top", "bottom"), toplabels = c("EP300", "NF2"))

# Or
geneList= dd.rra$Score
names(geneList) = dd.rra$id
p2 = RankView(geneList, top = 5, bottom = 10) + xlab("Score")
print(p2)

RankView(geneList, top = 0, bottom = 0, genelist = c("EP300", "NF2")) + xlab("Score")

Dot plot

Visualize negative and positive selected genes separately.

dd.rra$RandomIndex = sample(1:nrow(dd.rra), nrow(dd.rra))
dd.rra = dd.rra[order(-dd.rra$Score), ]
gg = dd.rra[dd.rra$Score>0, ]
p2 = ScatterView(gg, x = "RandomIndex", y = "Score", label = "id",
                 y_cut = CutoffCalling(dd.rra$Score,2), 
                 groups = "top", toplabels = dd.rra$id[1:5])
p2

sgRankView - visualize the rank of sgRNAs targeting top selected genes.

p2 = sgRankView(dd.sgrna, top = 4, bottom = 4)
print(p2)

Enrichment analysis

For more information about functional enrichment analysis in MAGeCKFlute, please read the [MAGeCKFlute_enrichment document] (https://www.bioconductor.org/packages/3.9/bioc/vignettes/MAGeCKFlute/inst/doc/MAGeCKFlute_enrichment.html), in which we introduce all the available options and methods.

geneList= dd.rra$Score
names(geneList) = dd.rra$id
enrich = EnrichAnalyzer(geneList = geneList[geneList>0.5], 
                        method = "HGT", type = "KEGG")
EnrichedView(enrich, mode = 1, top = 5)

EnrichedView(enrich, mode = 2, top = 5)

Section III: Downstream analysis of MAGeCK MLE

** Gene summary ** The gene summary file in MAGeCK-MLE results includes beta scores of all genes in multiple condition samples.

data("mle.gene_summary")
dd=ReadBeta(mle.gene_summary)
head(dd)
##     Gene      dmso       plx
## 1   FEZ1 -0.045088 -0.036721
## 2    TNN  0.094325  0.065533
## 3  NAT8L  0.026362  0.044979
## 4   OAS2 -0.271210 -0.289010
## 5 OR10H3 -0.098324 -0.365730
## 6  CCL16 -0.309750 -0.148830

Normalization of beta scores

It is difficult to control all samples with a consistent cell cycle in a CRISPR screen experiment with multi conditions. Besides, beta score among different states with an inconsistent cell cycle is incomparable. So it is necessary to do the normalization when comparing the beta scores in different conditions. Essential genes are those genes that are indispensable for its survival. The effect generated by knocking out these genes in different cell types is consistent. Based on this, we developed the cell cycle normalization method to shorten the gap of the cell cycle in different conditions. Besides, a previous normalization method called loess normalization is available in this package.(Laurent Gautier 2004)

ctrlname = "dmso"
treatname = "plx"
dd_essential = NormalizeBeta(dd, samples=c(ctrlname, treatname), method="cell_cycle")
head(dd_essential)
##     Gene        dmso         plx
## 1   FEZ1 -0.05884165 -0.05661055
## 2    TNN  0.12309790  0.10102827
## 3  NAT8L  0.03440347  0.06934141
## 4   OAS2 -0.35393992 -0.44554929
## 5 OR10H3 -0.12831676 -0.56382388
## 6  CCL16 -0.40423616 -0.22944223
#OR
dd_loess = NormalizeBeta(dd, samples=c(ctrlname, treatname), method="loess")
head(dd_loess)
##     Gene        dmso         plx
## 1   FEZ1 -0.04259327 -0.03921573
## 2    TNN  0.09946182  0.06039618
## 3  NAT8L  0.03032953  0.04101147
## 4   OAS2 -0.27158198 -0.28863802
## 5 OR10H3 -0.09841158 -0.36564242
## 6  CCL16 -0.30981640 -0.14876360

Distribution of all gene beta scores

After normalization, the distribution of beta scores in different conditions should be similar. We can evaluate the distribution of beta scores using the function ‘DensityView’, and ‘ConsistencyView’.

DensityView(dd_essential, samples=c(ctrlname, treatname))

ConsistencyView(dd_essential, ctrlname, treatname)

# Another option MAView
MAView(dd_essential, ctrlname, treatname)

Positive selection and negative selection

dd_essential$Control = rowMeans(dd_essential[,ctrlname, drop = FALSE])
dd_essential$Treatment = rowMeans(dd_essential[,treatname, drop = FALSE])

p1 = ScatterView(dd_essential, "Control", "Treatment", groups = c("top", "bottom"), auto_cut_diag = TRUE, display_cut = TRUE, toplabels = c("NF1", "NF2", "EP300"))
print(p1)

dd_essential$Diff = dd_essential$Treatment - dd_essential$Control
dd_essential$Rank = rank(dd_essential$Diff)
p1 = ScatterView(dd_essential, x = "Diff", y = "Rank", label = "Gene", 
                 top = 5, model = "rank")
print(p1)

# Or
rankdata = dd_essential$Treatment - dd_essential$Control
names(rankdata) = dd_essential$Gene
RankView(rankdata)

Nine-square scatter plot - identify treatment-associated genes

p1 = ScatterView(dd_essential, x = "dmso", y = "plx", label = "Gene", 
                 model = "ninesquare", top = 5, display_cut = TRUE, force = 2)
print(p1)

# Or
p2 = SquareView(dd_essential, label = "Gene", 
                x_cutoff = CutoffCalling(dd_essential$Control, 2), 
                y_cutoff = CutoffCalling(dd_essential$Treatment, 2))
print(p2)

Functional analysis for treatment-associated genes

# 9-square groups
Square9 = p1$data
idx=Square9$group=="topcenter"
geneList = Square9$Diff
names(geneList) = Square9$Gene[idx]
universe = Square9$Gene

# Enrichment analysis
kegg1 = EnrichAnalyzer(geneList = geneList, universe = universe)
EnrichedView(kegg1, top = 6, bottom = 0)

Also, pathway visualization can be done using function KeggPathwayView, the same as the section above.

genedata = dd_essential[, c("Control","Treatment")]
arrangePathview(genedata, pathways = "hsa01521", organism = "hsa", sub = NULL)

Session info

sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.1.0     MAGeCKFlute_1.6.5
## 
## loaded via a namespace (and not attached):
##   [1] fgsea_1.8.0            colorspace_1.4-1       class_7.3-14          
##   [4] modeltools_0.2-22      ggridges_0.5.1         mclust_5.4.2          
##   [7] rprojroot_1.3-2        qvalue_2.14.0          XVector_0.22.0        
##  [10] ggpubr_0.2             farver_1.1.0           urltools_1.7.1        
##  [13] ggrepel_0.8.0          flexmix_2.3-14         bit64_0.9-7           
##  [16] mvtnorm_1.0-8          AnnotationDbi_1.44.0   xml2_1.2.0            
##  [19] codetools_0.2-15       splines_3.5.0          robustbase_0.93-3     
##  [22] GOSemSim_2.8.0         knitr_1.20             pathview_1.22.0       
##  [25] jsonlite_1.5           annotate_1.60.0        kernlab_0.9-27        
##  [28] cluster_2.0.7-1        GO.db_3.7.0            png_0.1-7             
##  [31] pheatmap_1.0.10        graph_1.60.0           ggforce_0.1.3         
##  [34] msigdbr_7.0.1          compiler_3.5.0         httr_1.3.1            
##  [37] rvcheck_0.1.1          backports_1.1.2        assertthat_0.2.0      
##  [40] Matrix_1.2-15          lazyeval_0.2.1         limma_3.38.2          
##  [43] tweenr_1.0.0           htmltools_0.3.6        prettyunits_1.0.2     
##  [46] tools_3.5.0            igraph_1.2.2           gtable_0.2.0          
##  [49] glue_1.3.0             reshape2_1.4.3         DO.db_2.9             
##  [52] dplyr_0.8.3            fastmatch_1.1-0        Rcpp_1.0.2            
##  [55] enrichplot_1.2.0       Biobase_2.42.0         trimcluster_0.1-2.1   
##  [58] Biostrings_2.50.2      nlme_3.1-137           fpc_2.1-11.1          
##  [61] ggraph_1.0.2           stringr_1.4.0          clusterProfiler_3.13.0
##  [64] XML_3.98-1.16          dendextend_1.9.0       DOSE_3.11.2           
##  [67] DEoptimR_1.0-8         europepmc_0.3          zlibbioc_1.28.0       
##  [70] MASS_7.3-51.1          scales_1.0.0           hms_0.4.2             
##  [73] parallel_3.5.0         KEGGgraph_1.42.0       RColorBrewer_1.1-2    
##  [76] yaml_2.2.0             memoise_1.1.0          gridExtra_2.3         
##  [79] UpSetR_1.3.3           biomaRt_2.38.0         triebeard_0.3.0       
##  [82] stringi_1.2.4          RSQLite_2.1.1          genefilter_1.64.0     
##  [85] S4Vectors_0.20.1       BiocGenerics_0.28.0    BiocParallel_1.16.2   
##  [88] matrixStats_0.54.0     rlang_0.4.5            pkgconfig_2.0.2       
##  [91] prabclus_2.2-6         bitops_1.0-6           evaluate_0.12         
##  [94] lattice_0.20-38        purrr_0.2.5            labeling_0.3          
##  [97] cowplot_0.9.3          bit_1.1-14             tidyselect_0.2.5      
## [100] ggsci_2.9              plyr_1.8.4             magrittr_1.5          
## [103] R6_2.3.0               IRanges_2.16.0         DBI_1.0.0             
## [106] withr_2.1.2            mgcv_1.8-26            pillar_1.4.2          
## [109] whisker_0.3-2          units_0.6-1            survival_2.43-3       
## [112] KEGGREST_1.22.0        RCurl_1.95-4.11        nnet_7.3-12           
## [115] tibble_2.1.3           crayon_1.3.4           rmarkdown_1.10        
## [118] viridis_0.5.1          progress_1.2.0         grid_3.5.0            
## [121] sva_3.30.0             data.table_1.11.8      blob_1.1.1            
## [124] Rgraphviz_2.26.0       diptest_0.75-7         digest_0.6.18         
## [127] xtable_1.8-3           tidyr_0.8.2            gridGraphics_0.3-0    
## [130] stats4_3.5.0           munsell_0.5.0          viridisLite_0.3.0     
## [133] ggplotify_0.0.3

References

Hiroko Koike-Yusa, E-Pien Tan, Yilong Li. 2014. “Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library.” http://science.sciencemag.org/content/343/6166/80.long.

Laurent Gautier, Benjamin M. Bolstad, Leslie Cope. 2004. “affy—analysis of Affymetrix GeneChip data at the probe level.” https://academic.oup.com/bioinformatics/article/20/3/307/185980.

Luke A.Gilbert, BrittAdamson, Max A.Horlbeck. 2014. “Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation.” https://linkinghub.elsevier.com/retrieve/pii/S0092-8674(14)01178-7.

Ophir Shalem1, *, 2. 2014. “Genome-scale CRISPR-Cas9 knockout screening in human cells.” http://science.sciencemag.org/content/343/6166/84.long.

Silvana Konermann, Alexandro E. Trevino, Mark D. Brigham. 2015. “Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex.” https://www.nature.com/nature/journal/vnfv/ncurrent/full/nature14136.html.

Tim Wang, David M. Sabatini, Jenny J. Wei1. 2014. “Genetic Screens in Human Cells Using the CRISPR-Cas9 System.” http://science.sciencemag.org/content/343/6166/80.long.

Wei Li, Han Xu, Johannes Köster, and X. Shirley Liu. 2015. “Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR.” https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0843-6.

Wei Li, Tengfei Xiao, Han Xu, and X Shirley Liu. 2014. “MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens.” https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0554-4.